Evaluating alignment quality between iconic language and reference terminologies using similarity metrics
Abstract
BACKGROUND Visualization of Concepts in Medicine (VCM) is a compositional iconic language that aims to ease information retrieval in Electronic Health Records (EHR), clinical guidelines, and other medical documents. Using VCM in medical applications requires aligning it with medical reference terminologies. Alignments from the Medical Subject Headings (MeSH) thesaurus and the International Classification of Diseases, Tenth Revision (ICD10) to VCM are presented here. The aim of this study was to evaluate the quality of the alignment between VCM and these terminologies, using different measures of inter-alignment agreement, before integration into EHR.
METHODS For medical literature retrieval and EHR browsing, the MeSH thesaurus and the ICD10, both organized hierarchically, were aligned to VCM. Some MeSH to VCM alignments were produced automatically; the others were produced manually and validated. The ICD10 to VCM alignment was entirely manual. Inter-alignment agreement was assessed on ICD10 codes and MeSH descriptors sharing the same Concept Unique Identifiers in the Unified Medical Language System (UMLS). Three metrics were used to compare two VCM icons: binary comparison, crude Dice Similarity Coefficient (DSCcrude), and semantic Dice Similarity Coefficient (DSCsemantic), the latter based on Lin similarity. An analysis of discrepancies was also performed.
RESULTS MeSH to VCM alignment resulted in 10,783 relations, of which 1,830 were produced manually and 8,953 were automatically inherited. ICD10 to VCM alignment led to 19,852 relations. UMLS gathered 1,887 alignments between ICD10 and MeSH, of which only 1,606 were used for this study. Using only the validated MeSH to VCM alignments, inter-alignment agreement was 74.2% (95% CI 70.5-78.0), DSCcrude was 0.93 (95% CI 0.91-0.94), and DSCsemantic was 0.96 (95% CI 0.95-0.96). Discrepancy analysis revealed that although two thirds of the errors came from the reviewers, UMLS was nevertheless responsible for one third.
CONCLUSIONS This study has shown strong overall inter-alignment agreement between MeSH to VCM and ICD10 to VCM manual alignments. VCM icons have now been integrated into a guideline search engine (http://www.cismef.org) and a health terminologies portal (http://www.hetop.eu).
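
The three comparison metrics named in METHODS can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: an icon is modelled as a set of component labels (the labels below are hypothetical, not actual VCM components), DSCcrude is taken to be the standard Dice coefficient on those sets, and DSCsemantic is approximated by a "soft" Dice in which each component is credited with its best similarity to any component of the other icon. The exact formulation used in the study may differ.

```python
from typing import Callable, FrozenSet

# Assumption for this sketch: an icon is a set of component labels.
Icon = FrozenSet[str]


def binary_agreement(a: Icon, b: Icon) -> int:
    """Return 1 if the two icons are strictly identical, else 0."""
    return int(a == b)


def dice_crude(a: Icon, b: Icon) -> float:
    """Standard Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def lin_similarity(ic_a: float, ic_b: float, ic_lcs: float) -> float:
    """Lin similarity from information content (IC) values:
    2 * IC(lowest common subsumer) / (IC(a) + IC(b))."""
    total = ic_a + ic_b
    if total == 0:
        return 0.0  # degenerate case; a choice made for this sketch
    return 2 * ic_lcs / total


def dice_semantic(a: Icon, b: Icon, sim: Callable[[str, str], float]) -> float:
    """Soft Dice (an assumed formulation): each component contributes its
    best similarity to the other icon instead of a strict 0/1 match."""
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    best_a = sum(max(sim(x, y) for y in b) for x in a)
    best_b = sum(max(sim(y, x) for x in a) for y in b)
    return (best_a + best_b) / (len(a) + len(b))


def exact(x: str, y: str) -> float:
    """Strict matching; with it, dice_semantic reduces to dice_crude."""
    return 1.0 if x == y else 0.0


# Hypothetical icons that differ in a single component.
icon_1 = frozenset({"heart", "disease", "red"})
icon_2 = frozenset({"heart", "disease", "orange"})

print(binary_agreement(icon_1, icon_2))
print(round(dice_crude(icon_1, icon_2), 3))
print(round(dice_semantic(icon_1, icon_2, exact), 3))
```

In practice `sim` would wrap `lin_similarity` with information content values derived from a component hierarchy. Because a semantic similarity gives partial credit to related but non-identical components, the soft score is never lower than the crude one, which is in line with the ordering of the reported values (binary agreement lowest, DSCsemantic highest).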